CORRECTING PARTIAL, MULTIPLE, AND CANONICAL CORRELATIONS FOR ATTENUATION



Introduction 



In dealing with multivariate correlational techniques and fallible
data, one is faced with the same difficulties that have been pointed out
for the product-moment correlation (Finucci, 1970) and other related
measures of association (Stanley and Livingston, 1970). Correlations based
on fallible variables will result in values which are underestimates of
the correlations among the true parts of the variables. (The truth of this
statement for all multivariate situations has not been proven analytically,
but Cochran's (1970) work regarding multiple correlation suggests that the
statement does, in fact, hold true.) Investigators have been inclined to
ignore the problems of unreliability, being content with fallible
underestimates of the true relationship and avoiding "questionable"
correction for attenuation procedures. However, such an approach ignores
useful information. In addition to providing a means for obtaining
estimates of true score correlations, correction for attenuation formulas
facilitate understanding of the effects of unreliability on the results.
Information of this type is useful, for example, in deciding how much could
be gained by expending time and money to develop more reliable measurement.

Size of the correlation coefficient is not the only concern for research
involving multivariate measurement. One is often more concerned with the
contribution of each individual variable to the overall result. For
example, a canonical correlation analysis rarely stops with the coefficient
itself. The relative sizes of the weighting factors for each of the
variables in the canonical variates are vital for interpretation. Errors
of measurement attenuate these weighting factors as well as the overall
correlation, making the interpretation of canonical correlations and
variates computed from fallible data a questionable, or at best difficult,
undertaking. This same point holds true for multiple correlation and the
use to which it is put.

The purpose of the present paper is to give the correction for
attenuation formulae for partial, multiple, and canonical correlation
coefficients and to discuss, where known, the effects of measurement error
on these statistics. Most of the formulas presented have been derived
elsewhere in the literature; I have simply standardized the notation and
extended some of the derivations where appropriate.

The Partial Correlation Corrected for Attenuation 

First, let us consider the first-order partial correlation coefficient.
Suppose we have three variables $x_1$, $x_2$, and $x_3$ which are fallible
measures of, say, alienation, school achievement, and I.Q., and want to
know the true correlation between alienation and school achievement,
controlling for I.Q. We begin by defining the variable $x_i$ to be the sum
of its true score, $t_i$, and errors of measurement, $e_i$ (see (1) on list
of formulae). We assume that $\sigma_{t_i e_i} = 0$, $\sigma_{t_i e_j} = 0$,
and $\sigma_{e_i e_j} = 0$ for $i \neq j$. That is, we begin with the
classical test theory model and the classical test theory assumptions.



The partial correlation between alienation and achievement, controlling
for I.Q., is defined as the zero-order correlation of residuals. The
residuals for alienation are given by the difference between the observed
values and the regression estimates of alienation from I.Q. The residuals
are represented symbolically in equations (2) and (3). It is well known
that the correlation of residuals can be expressed in terms of the three
zero-order correlations. The formula is given by (4), which is the partial
r based on fallible variates. The partial correlation coefficient,
corrected for attenuation, would yield the partial correlation of true
scores, i.e., the correlation of true score residuals. We can obtain the
correction for attenuation formula by starting with the correlation of
true score residuals and working backwards.

The true score residuals are defined as the difference between the true
value and the estimated true value based on a regression of the variable
($t_1$ or $t_2$) on the true value of the control variable ($t_3$) and are
given in formulas (5) and (6). The partial correlation of $t_1$ and $t_2$
controlling for $t_3$ is then given by (7). Expanding numerator and
denominator, we can use some of the well-known properties of classical test
theory to express the true partial correlation in terms of the fallible
zero-order correlations and reliabilities. The result, given in (11), is
the correction for attenuation formula for a first-order partial
correlation. It should be noted that the formula is equivalent to
correcting each of the zero-order correlations for attenuation in the usual
way and plugging these values into (4). (See Livingston and Stanley, 1970.)
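
The equivalence just noted can be checked numerically. What follows is a minimal Python sketch rather than anything taken from the paper: the function names and the correlation and reliability values are hypothetical, chosen only so that every corrected correlation stays below one.

```python
import math

def partial_r(r12, r13, r23):
    """First-order partial correlation of variables 1 and 2 given 3, as in (4)."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))

def corrected_partial_r(r12, r13, r23, rel1, rel2, rel3):
    """Partial correlation corrected for attenuation, as in (11)."""
    numerator = rel3 * r12 - r13 * r23
    denominator = (math.sqrt(rel1 * rel3 - r13**2)
                   * math.sqrt(rel2 * rel3 - r23**2))
    return numerator / denominator

def disattenuate(r, rel_a, rel_b):
    """Usual zero-order correction for attenuation."""
    return r / math.sqrt(rel_a * rel_b)

# Hypothetical fallible correlations and reliabilities (illustration only).
r12, r13, r23 = 0.40, 0.30, 0.50
rel1, rel2, rel3 = 0.80, 0.90, 0.85

# Route 1: apply (11) directly to the fallible values.
direct = corrected_partial_r(r12, r13, r23, rel1, rel2, rel3)

# Route 2: correct each zero-order correlation, then apply (4).
two_step = partial_r(disattenuate(r12, rel1, rel2),
                     disattenuate(r13, rel1, rel3),
                     disattenuate(r23, rel2, rel3))

print(direct, two_step)
```

Both routes give the same value up to floating-point error, which is the property the text relies on when it later treats multivariate corrections as a two-step procedure.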
Bohrnstedt (1969) derived a formula for correcting partial correlations
for attenuation due to errors of measurement which is similar to (11), but
does not contain the terms $\rho_{11}$ and $\rho_{22}$. Upon examining his
derivation, it was apparent that he was correcting only for errors of
measurement in the control variable $x_3$. In effect, he had provided the
formula for a partially corrected partial correlation coefficient. On the
basis of his formula, Bohrnstedt indicates that it is possible for the
corrected partial correlation to be less than the obtained partial
correlation. This does not seem to be the case, however, and it appears
that correcting for attenuation will result in larger values. Since
$\rho_{33} \le 1$, the numerator of (11) will be less than or equal to that
of (4) (for nonnegative $\rho_{x_1x_2}$). This would tend to decrease the
value of the corrected partial correlation. On the other hand, the
denominator of (11), being less than that of (4), would tend to increase
$\rho_{t_1t_2 \cdot t_3}$. The relative size of the numerator to the
denominator in (11) would seem to result in an overall increase in the
partial correlation when corrected for attenuation. (A few numbers I have
plugged into the equations indicate such a trend, though I have no analytic
proof of this statement.)


A General Approach to Multivariate Corrections for Attenuation 


Meredith (1964) has developed a more general approach to correction for
attenuation problems which he has applied to the canonical correlation
problem. His result can be readily applied to problems involving partial
and multiple correlation. We begin with a variance-covariance matrix,
$\Sigma_x$, of rank p + q, where p + q is the number of variables being
fallibly measured. Under the assumption that the classical test theory
model is appropriate for each of the p + q variables, we can write the
matrix $\Sigma_x$ as the sum of two matrices: $\Sigma_t$, the
variance-covariance matrix among true scores, and $\Sigma_e$, the
variance-covariance matrix among the errors of measurement (equation 12).
Assuming errors of measurement to covary zero with each other, the matrix
$\Sigma_e$ is a diagonal matrix of the variance errors of measurement. We
can obtain $\Sigma_t$ by subtraction (equation 13). Given $\Sigma_t$, the
matrix of true score variances and covariances, it is a simple matter to
obtain the matrix of true score correlations by dividing each element by
the square root of the product of the appropriate variances. These
operations are shown in matrix notation in equation (14).

It is important to note here that (14) is equivalent to correcting
each of the zero-order correlations in $P_x$, the matrix of fallible
correlations, for attenuation in the usual manner. That such is the case
becomes clear if we consider each of the p + q variables to have mean zero
and variance one. Under these conditions $\Sigma_x = P_x$ and $\Sigma_e$ is
a diagonal matrix of alienation coefficients (one minus the reliability of
each variable). Thus, the matrix $\Sigma_t$ of (13) is the matrix of
fallible intercorrelations ($P_x$) with reliabilities on the diagonal,
which is the true-score variance-covariance matrix of standard deviates.
The operations shown in (14) now involve dividing every correlation in
$P_x$ by the square root of the product of the reliabilities for the
appropriate variables, which is the zero-order correction for attenuation
procedure.
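
The matrix operations in (13) and (14) can be sketched in a few lines of Python. This is an illustrative sketch, not the author's procedure as published: it assumes NumPy is available, the function name `true_score_correlations` is hypothetical, and the example matrix and reliabilities are invented. It relies on the classical test theory identity that the error variance of a variable is its observed variance times one minus its reliability.

```python
import numpy as np

def true_score_correlations(sigma_x, reliabilities):
    """Return P_t from a fallible covariance matrix and known reliabilities."""
    sigma_x = np.asarray(sigma_x, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)
    # Error variance of each variable: (1 - reliability) * observed variance.
    sigma_e = np.diag((1.0 - rel) * np.diag(sigma_x))
    sigma_t = sigma_x - sigma_e                  # equation (13)
    scale = 1.0 / np.sqrt(np.diag(sigma_t))
    return sigma_t * np.outer(scale, scale)      # equation (14)

# Hypothetical fallible correlations and reliabilities for three variables.
p_x = np.array([[1.0, 0.4, 0.3],
                [0.4, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
rel = np.array([0.80, 0.90, 0.85])

p_t = true_score_correlations(p_x, rel)

# Zero-order correction applied element by element, for comparison.
zero_order = p_x / np.sqrt(np.outer(rel, rel))
np.fill_diagonal(zero_order, 1.0)

print(np.allclose(p_t, zero_order))
```

Passing a covariance matrix with unequal variances works the same way, since the rescaling in (14) removes the variances again.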

So far, the discussion has been in terms of population values. Meredith
has pointed out that a maximum likelihood estimate of $\Sigma_t$, and thus
of $P_t$, can be obtained from $S_x$, the sample variance-covariance
matrix, if the reliabilities of the measures are known (equations 15 and
16). Though the remainder of the paper continues to use the population
values, one can easily substitute $\hat{P}_t$ under the above restriction.

A general procedure for correcting multivariate correlations for
attenuation involves the following two steps. First, correct each of the
zero-order correlations for attenuation in the usual way to obtain $P_t$.
Second, calculate the desired statistic from $P_t$.



Let us return to the problem of partial correlations. Suppose that we
were interested in obtaining the true score correlations among a set of p
variates controlling for true scores on a second set of q variables. We
could solve the problem by first obtaining $P_t$ (or, more likely, its
estimate, $\hat{P}_t$), partitioning $P_t$ as shown in (17), and using the
matrix solution for partial correlations (Anderson, 1958, and Morrison,
1967) shown in (18). If q = 1, $P_{t_{1.2}}$ is a p x p matrix of first
order partials whose off-diagonal elements, once each is divided by the
square root of the product of the corresponding diagonal elements, are of
the form given in (4). Extending our three-variable example, $P_{t_{1.2}}$
could be a matrix of attitude-achievement true score correlations,
controlling for I.Q.
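
The partitioned solution in (18) might be computed as in the following Python sketch. It assumes NumPy, a hypothetical function name, and an invented true-score correlation matrix. The final rescaling to a unit diagonal is a step added here so that the off-diagonal entries are correlations of the form (4).

```python
import numpy as np

def partial_correlation_matrix(p, n_first):
    """Partial correlations among the first n_first variables,
    controlling for the remaining ones, following (17) and (18)."""
    p = np.asarray(p, dtype=float)
    p11 = p[:n_first, :n_first]
    p12 = p[:n_first, n_first:]
    p21 = p[n_first:, :n_first]
    p22 = p[n_first:, n_first:]
    # Equation (18): P11 - P12 P22^{-1} P21 (residual covariance matrix).
    residual = p11 - p12 @ np.linalg.solve(p22, p21)
    # Rescale to a unit diagonal so the entries are correlations.
    scale = 1.0 / np.sqrt(np.diag(residual))
    return residual * np.outer(scale, scale)

# Hypothetical true-score correlations: two variates and one control (q = 1).
p_t = np.array([[1.0, 0.5, 0.4],
                [0.5, 1.0, 0.6],
                [0.4, 0.6, 1.0]])

partials = partial_correlation_matrix(p_t, 2)
print(partials)
```

With q = 1 the off-diagonal entry reduces to the familiar three-variable expression, which gives a quick check on the matrix route.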



The multiple correlation problem involves finding the maximum
correlation R between a single criterion and a linear combination of, say,
p predictors. The matrix solution for $R^2$ is given in (19) (see Anderson,
1958, or Morrison, 1967). The multiple correlation between the true scores
of the p predictors and the criterion could be obtained by substituting
the corresponding true score correlation matrices of (17) into (19),
resulting in equation (20).
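
As a hedged illustration of (19) and (20), the Python sketch below computes a squared multiple correlation from a partitioned correlation matrix. NumPy, the function name, and the matrix values are assumptions made for the example. The predictors are taken to occupy the leading rows and columns, with the criterion last.

```python
import numpy as np

def squared_multiple_correlation(p):
    """R^2 of the last variable on all preceding ones: P21 P11^{-1} P12."""
    p = np.asarray(p, dtype=float)
    p11 = p[:-1, :-1]          # predictor intercorrelations
    p12 = p[:-1, -1]           # predictor-criterion correlations
    return float(p12 @ np.linalg.solve(p11, p12))

# Hypothetical correlation matrix: two predictors, then the criterion.
p_example = np.array([[1.0, 0.3, 0.5],
                      [0.3, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])

r_squared = squared_multiple_correlation(p_example)
print(round(r_squared, 3))
```

Supplying the fallible matrix gives (19), and supplying the true-score matrix gives (20). The function itself does not need to know which one it receives.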

In the above situation, any of the p + 1 variables could be designated
as the criterion by simply interchanging the appropriate rows and columns
of $P_t$. A general formula for the squared multiple correlation
coefficient (SMC) of each variate with the remaining variates, corrected
for attenuation, is given by (21), where I is a p + 1 identity matrix and D
indicates diagonals of the matrices given in parentheses.

A connection can be made here with factor analysis. It is common practice
to factor a matrix of the form $[P - D(P^{-1})^{-1}]$, which is the case
where the SMC coefficient is used as an estimate of communality. (However,
Harris (1964) indicates that such a procedure does not represent "true"
factor analysis.)
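
The diagonal formula (21) and its link to the reduced matrix used in factor analysis can be sketched as follows. This is an illustration under assumptions: NumPy is used, the function name is hypothetical, the example matrix is invented, and (21) is read as one minus the reciprocal of each diagonal element of the inverse correlation matrix.

```python
import numpy as np

def squared_multiple_correlations(p):
    """SMC of every variable with all the others: 1 - 1 / diag(P^{-1})."""
    p = np.asarray(p, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(p))

# Hypothetical correlation matrix for three variables.
p_example = np.array([[1.0, 0.3, 0.5],
                      [0.3, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])

smc = squared_multiple_correlations(p_example)

# Reduced matrix P - D(P^{-1})^{-1}: its diagonal holds the SMCs.
reduced = p_example - np.diag(1.0 / np.diag(np.linalg.inv(p_example)))

print(smc)
print(np.diag(reduced))
```

Each entry of `smc` is the value that (19) or (20) would give if the corresponding variable were moved into the criterion position.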



The last statistic we shall discuss is the canonical correlation
coefficient (Hotelling, 1936). Canonical correlation is a generalization
of the concept of multiple correlation to the case of multiple criteria
(q > 1) as well as multiple predictors (p > 1). The objective in such an
analysis is to find the maximum correlation between a linear composite of
the predictors and a linear composite of the criteria. Though Hotelling was
primarily concerned with the largest correlation between these composites,
there are k = min (p, q) possible independent correlations. The k canonical
correlations for any given set of p predictors and q criteria are given by
the roots of the determinantal equation given in (22). If the true-score
correlation matrix of (17) is used, as in (23), one obtains the k canonical
correlations corrected for attenuation. For completeness, the formulas for
the weighting vectors to form the linear composite of the criterion
variates ($a_i$ in 24) and the linear composite of predictor variables
($b_i$ in 25) are also given. The formulas would provide either the
fallible weighting vectors or the true-score weights, depending on which
correlation matrices were used.
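
One way to obtain the roots of the determinantal equation and the two sets of weights is through an eigenvalue problem, as in the Python sketch below. It is offered under stated assumptions: NumPy is used, the function name and the example matrix are hypothetical, the predictors are placed first in the matrix, and the roots are treated as squared canonical correlations that are all greater than zero.

```python
import numpy as np

def canonical_correlations(p, n_pred):
    """Canonical correlations and weights from a partitioned matrix.

    The first n_pred variables are predictors, the rest criteria.
    Returns (rho, a, b): correlations, criterion weights, predictor weights.
    """
    p = np.asarray(p, dtype=float)
    p11 = p[:n_pred, :n_pred]
    p12 = p[:n_pred, n_pred:]
    p21 = p12.T
    p22 = p[n_pred:, n_pred:]
    # Roots of |P21 P11^{-1} P12 - lambda P22| = 0 are the eigenvalues
    # of P22^{-1} P21 P11^{-1} P12.
    m = np.linalg.solve(p22, p21 @ np.linalg.solve(p11, p12))
    eigvals, eigvecs = np.linalg.eig(m)
    order = np.argsort(eigvals.real)[::-1]
    k = min(n_pred, p.shape[0] - n_pred)
    lam = np.clip(eigvals.real[order][:k], 0.0, None)
    a = eigvecs.real[:, order][:, :k]                 # criterion weights
    b = np.linalg.solve(p11, p12 @ a) / np.sqrt(lam)  # predictor weights
    return np.sqrt(lam), a, b

# Hypothetical correlation matrix: two predictors followed by two criteria.
p_example = np.array([[1.0, 0.3, 0.4, 0.2],
                      [0.3, 1.0, 0.2, 0.3],
                      [0.4, 0.2, 1.0, 0.3],
                      [0.2, 0.3, 0.3, 1.0]])

rho, a, b = canonical_correlations(p_example, 2)
print(rho)
```

With a single criterion (q = 1) the first canonical correlation reduces to the multiple correlation, which is a convenient check on any implementation.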
Effects of Errors of Measurement

In the introduction it was pointed out that the most well-known effect
of errors of measurement is to produce a statistic which is an underestimate
of the true value. For example, Cochran (1970) has shown that for a number
of situations involving multiple correlation, a good estimate of the
attenuating effects of fallible data is given by (26), though the actual
value of $R^2$ may run 25% higher than this value for positive [...] and
low predictor reliability (= 0.5).



A second problem raised when errors of measurement are present is the
valid interpretation of results for multivariate correlations. In multiple
and canonical correlation studies an important objective is to discover the
relative importance of the predictor and criterion variables. The
intercorrelations among these variables and their unreliability can
interact to produce misleading results. An example from Cochran (1970)
illustrates this point.



A common practice in the application of multiple correlation (especially
among sociologists) is to partition the predicted variance ($R^2$) into
portions uniquely attributable to each predictor and the portion of common
variance predicted. Unreliability can have a substantial effect on the
results of such an analysis. Consider the 2-predictor case with
$\rho_{x_1x_3} = \rho_{x_2x_3} = .5$, $\rho_{x_1x_2} = .3$, and no error of
measurement:

    $R^2_{3.12}$ = .385
    % of variance unique to $x_1$ = 13.5
    % of variance unique to $x_2$ = 13.5
    % of variance common to both = 11.5

With the reliability of variable 1 equal to .8, i.e., $\rho_{11} = .8$ and
$\rho_{22} = 1$:

    $R^2_{3.12}$ = .356
    % of variance unique to $x_1$ = 10.6
    % of variance unique to $x_2$ = 15.6
    % of variance common to both = 9.4

With $\rho_{11} = 0.6$ and $\rho_{22} = 1$:

    $R^2_{3.12}$ = .328
    % of variance unique to $x_1$ = 7.8
    % of variance unique to $x_2$ = 17.8
    % of variance common to both = 7.2



In the above example we see that as the reliabilities of the predictors
become more disparate, the true contribution of each variable becomes more
distorted. This effect can be best understood when you consider what would
happen if $x_1$ were removed from the correlation: $R^2_{3.2}$ = .25 and
percent of variance unique to $x_2$ would be 25. Unreliability in one of
the variables takes part of that variable "out" of the prediction, shifting
predicted variance to the more reliable predictors. The change in the
importance of predictors in multiple correlation caused by deletion of one
of the variables has been referred to as the "bouncing betas." Difference
in the reliabilities adds more bounce to these results.
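
The variance partition in the example can be reproduced with a short Python sketch. The inputs are assumptions: true correlations of .5 between each predictor and the criterion and .3 between the two predictors, values inferred here because they reproduce the reported figures, together with an error-free criterion. The function name is hypothetical.

```python
import math

def variance_partition(r12, r13, r23, rel1=1.0, rel2=1.0):
    """Partition R^2 for criterion 3 on fallible predictors 1 and 2.

    r12, r13, r23 are true-score correlations; rel1 and rel2 are the
    predictor reliabilities. The criterion is assumed to be error-free.
    """
    # Attenuate the true correlations by the predictor reliabilities.
    o12 = r12 * math.sqrt(rel1 * rel2)
    o13 = r13 * math.sqrt(rel1)
    o23 = r23 * math.sqrt(rel2)
    r2 = (o13**2 + o23**2 - 2 * o12 * o13 * o23) / (1 - o12**2)
    unique1 = r2 - o23**2      # gain from adding predictor 1 to predictor 2
    unique2 = r2 - o13**2      # gain from adding predictor 2 to predictor 1
    common = r2 - unique1 - unique2
    return r2, unique1, unique2, common

for rel1 in (1.0, 0.8, 0.6):
    r2, u1, u2, c = variance_partition(0.3, 0.5, 0.5, rel1=rel1)
    print(f"rel1={rel1:.1f}  R2={r2:.3f}  unique1={100 * u1:.1f}  "
          f"unique2={100 * u2:.1f}  common={100 * c:.1f}")
```

Under these assumed inputs the printed values agree with the example's figures to the reported precision. Lowering `rel1` shrinks the share credited to the first predictor and enlarges the share credited to the second.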
Formulas for Paper

Correcting Partial, Multiple, and Canonical Correlation for Attenuation

(1)  $x_i = t_i + e_i$   (for i = 1, 2, 3)

(2)  $x_{1.3} = x_1 - \beta_{x_1x_3}\,x_3$

(3)  $x_{2.3} = x_2 - \beta_{x_2x_3}\,x_3$

(4)  $\rho_{x_1x_2 \cdot x_3} = \rho_{x_{1.3}x_{2.3}} = \dfrac{\sigma_{x_{1.3}x_{2.3}}}{\sqrt{\sigma^2_{x_{1.3}}\,\sigma^2_{x_{2.3}}}} = \dfrac{\rho_{x_1x_2} - \rho_{x_1x_3}\rho_{x_2x_3}}{\sqrt{(1 - \rho^2_{x_1x_3})(1 - \rho^2_{x_2x_3})}}$

(5)  $t_{1.3} = t_1 - \beta_{t_1t_3}\,t_3$

(6)  $t_{2.3} = t_2 - \beta_{t_2t_3}\,t_3$

(7)  $\rho_{t_1t_2 \cdot t_3} = \rho_{t_{1.3}t_{2.3}} = \dfrac{\sigma_{t_{1.3}t_{2.3}}}{\sqrt{\sigma^2_{t_{1.3}}\,\sigma^2_{t_{2.3}}}}$

     where the numerator is the covariance of $(t_1 - \beta_{t_1t_3}t_3)$ and
     $(t_2 - \beta_{t_2t_3}t_3)$ and the denominator contains the variances
     of these two residuals.

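Derivation of correction formula for partial correlation coefficient.

Expanding the numerator of (7)

$\sigma_{t_{1.3}t_{2.3}} = \sigma_{t_1t_2} - \beta_{t_2t_3}\sigma_{t_1t_3} - \beta_{t_1t_3}\sigma_{t_2t_3} + \beta_{t_1t_3}\beta_{t_2t_3}\sigma^2_{t_3}$

$= \sigma_{t_1}\sigma_{t_2}\rho_{t_1t_2} - \rho_{t_2t_3}\dfrac{\sigma_{t_2}}{\sigma_{t_3}}\,\sigma_{t_1}\sigma_{t_3}\rho_{t_1t_3} - \rho_{t_1t_3}\dfrac{\sigma_{t_1}}{\sigma_{t_3}}\,\sigma_{t_2}\sigma_{t_3}\rho_{t_2t_3} + \rho_{t_1t_3}\rho_{t_2t_3}\dfrac{\sigma_{t_1}\sigma_{t_2}}{\sigma^2_{t_3}}\,\sigma^2_{t_3}$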

* 

Presented at the American Educational Research Association 55th Annual 
Meeting, New York City, February 4-7, 1971. 
(The last two terms are equivalent except for sign, and thus they sum to zero.)

$= \sigma_{t_1}\sigma_{t_2}\rho_{t_1t_2} - \sigma_{t_1}\sigma_{t_2}\rho_{t_1t_3}\rho_{t_2t_3}$

$= \sqrt{\rho_{11}\rho_{22}}\;\sigma_{x_1}\sigma_{x_2}\left[\dfrac{\rho_{x_1x_2}}{\sqrt{\rho_{11}\rho_{22}}} - \dfrac{\rho_{x_1x_3}}{\sqrt{\rho_{11}\rho_{33}}} \cdot \dfrac{\rho_{x_2x_3}}{\sqrt{\rho_{22}\rho_{33}}}\right]$

Therefore, where $\rho_{ii}$ is the reliability coefficient of variable $x_i$,

(8)  $\sigma_{t_{1.3}t_{2.3}} = \sigma_{x_1}\sigma_{x_2}\left[\dfrac{\rho_{33}\rho_{x_1x_2} - \rho_{x_1x_3}\rho_{x_2x_3}}{\rho_{33}}\right]$




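Expanding the first term in the denominator of (7) one secures

$\sigma^2_{t_{1.3}} = \sigma^2_{t_1} + \beta^2_{t_1t_3}\sigma^2_{t_3} - 2\beta_{t_1t_3}\sigma_{t_1t_3}$

$= \sigma^2_{t_1} + \rho^2_{t_1t_3}\sigma^2_{t_1} - 2\rho^2_{t_1t_3}\sigma^2_{t_1}$

$= \sigma^2_{t_1} - \rho^2_{t_1t_3}\sigma^2_{t_1}$

$= \rho_{11}\sigma^2_{x_1}\left[1 - \dfrac{\rho^2_{x_1x_3}}{\rho_{11}\rho_{33}}\right]$

Therefore,

(9)  $\sigma^2_{t_{1.3}} = \sigma^2_{x_1}\left[\dfrac{\rho_{11}\rho_{33} - \rho^2_{x_1x_3}}{\rho_{33}}\right]$

Similarly, it can be shown that

(10)  $\sigma^2_{t_{2.3}} = \sigma^2_{x_2}\left[\dfrac{\rho_{22}\rho_{33} - \rho^2_{x_2x_3}}{\rho_{33}}\right]$

Substituting (8), (9), and (10) into (7) and simplifying we get

(11)  $\rho_{t_1t_2 \cdot t_3} = \dfrac{\rho_{33}\rho_{x_1x_2} - \rho_{x_1x_3}\rho_{x_2x_3}}{\sqrt{\rho_{11}\rho_{33} - \rho^2_{x_1x_3}}\;\sqrt{\rho_{22}\rho_{33} - \rho^2_{x_2x_3}}}$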



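(12)  $\Sigma_x = \Sigma_t + \Sigma_e$

(13)  $\Sigma_t = \Sigma_x - \Sigma_e$

(14)  $P_t = D(\Sigma_t)^{-1/2}\,\Sigma_t\,D(\Sigma_t)^{-1/2}$

(15)  $\hat{\Sigma}_t = S_x - \Sigma_e$

(16)  $\hat{P}_t = D(\hat{\Sigma}_t)^{-1/2}\,\hat{\Sigma}_t\,D(\hat{\Sigma}_t)^{-1/2}$

(17)  $P_t = \begin{bmatrix} P_{t_{11}} & P_{t_{12}} \\ P_{t_{21}} & P_{t_{22}} \end{bmatrix}$

(18)  $P_{t_{1.2}} = P_{t_{11}} - P_{t_{12}}\,P_{t_{22}}^{-1}\,P_{t_{21}}$

(19)  $R_x^2 = P_{x_{21}}\,P_{x_{11}}^{-1}\,P_{x_{12}}$

(20)  $R_t^2 = P_{t_{21}}\,P_{t_{11}}^{-1}\,P_{t_{12}}$

(21)  $D(R_t^2) = I - [D(P_t^{-1})]^{-1}$

(22)  $\left| P_{x_{21}}\,P_{x_{11}}^{-1}\,P_{x_{12}} - \lambda P_{x_{22}} \right| = 0$

(23)  $\left| P_{t_{21}}\,P_{t_{11}}^{-1}\,P_{t_{12}} - \lambda P_{t_{22}} \right| = 0$

(24)  $\left( P_{t_{21}}\,P_{t_{11}}^{-1}\,P_{t_{12}} - \lambda_i P_{t_{22}} \right) a_i = 0$

(25)  $b_i = \dfrac{1}{\sqrt{\lambda_i}}\,P_{t_{11}}^{-1}\,P_{t_{12}}\,a_i$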

(26)  $R_x^2 \approx R_t^2\,\rho_{qq}\,\dfrac{\sum_{i=1}^{q-1} \rho_{ii}\,\rho_{iq}^2}{\sum_{i=1}^{q-1} \rho_{iq}^2} = R_t^2\,\rho_{qq}\,\bar{\rho}_{ii}$

where $\rho_{qq}$ is reliability of the criterion

      $\rho_{ii}$ is reliability of the ith predictor

      $\rho_{iq}$ is correlation between criterion and ith predictor.